Update CI #3114

Merged
zjjlivein merged 11 commits into PaddlePaddle:develop from swgu98:test
Dec 10, 2025
Conversation

swgu98 (Member) commented Dec 5, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.

      # Install and register `pre-commit` in the project folder
      pip install pre-commit && pre-commit install

      # Process previous code files separately
      pre-commit run --files XXXX.py

  • Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description

paddle-bot (Bot) commented Dec 5, 2025

Thanks for your contribution!

swgu98 changed the title from "test" to "Update CI" Dec 10, 2025
zjjlivein merged commit a25fd8d into PaddlePaddle:develop Dec 10, 2025
5 checks passed
swgu98 deleted the test branch December 10, 2025 07:22
Ace-To-HYB pushed a commit to Ace-To-HYB/PaddleFormers that referenced this pull request Feb 27, 2026
…1060)

## Operator Index

- [1. Conversion operators](PaddlePaddle#1-转换算子)
  - [1.1 LLaVA conversion operators](PaddlePaddle#11-llava转换算子)
    - [1.1.1 llava_convert](PaddlePaddle#111-llava_convert)
- [2. Filtering operators](PaddlePaddle#2-过滤算子)
  - [2.1 Basic filtering operators](PaddlePaddle#21-基础过滤算子)
    - [2.1.1 valid_data_filter](PaddlePaddle#211-valid_data_filter)
      - [2.1.1.1 image_compliance_operator](PaddlePaddle#2111-image_compliance_operator)
      - [2.1.1.2 conversation_compliance_operator](PaddlePaddle#2112-conversation_compliance_operator)
  - [2.2 Text filtering operators](PaddlePaddle#22-文本过滤算子)
    - [2.2.1 conversation_length_filter](PaddlePaddle#221-conversation_length_filter)
    - [2.2.2 average_line_length_filter](PaddlePaddle#222-average_line_length_filter)
    - [2.2.3 maximum_line_length_filter](PaddlePaddle#223-maximum_line_length_filter)
    - [2.2.4 conversation_percentage_filter](PaddlePaddle#224-conversation_percentage_filter)
    - [2.2.5 token_num_filter](PaddlePaddle#225-token_num_filter)
    - [2.2.6 alphanumeric_ratio_filter](PaddlePaddle#226-alphanumeric_ratio_filter)
    - [2.2.7 stopwords_ratio_filter](PaddlePaddle#227-stopwords_ratio_filter)
    - [2.2.8 special_characters_filter](PaddlePaddle#228-special_characters_filter)
    - [2.2.9 language_id_filter](PaddlePaddle#229-language_id_filter)
    - [2.2.10 text_action_filter](PaddlePaddle#2210-text_action_filter)
    - [2.2.11 text_entity_dependency_filter](PaddlePaddle#2211-text_entity_dependency_filter)
    - [2.2.12 char_ngram_repetition_filter](PaddlePaddle#2212-char_ngram_repetition_filter)
    - [2.2.13 word_ngram_repetition_filter](PaddlePaddle#2213-word_ngram_repetition_filter)
    - [2.2.14 conversation_hash_filter](PaddlePaddle#2214-conversation_hash_filter)
      - [2.2.14.1 simhash_duplicate_operator](#22141-simhash_duplicate_operator)
      - [2.2.14.2 minhash_duplicate_operator](#22142-minhash_duplicate_operator)
    - [2.2.15 llm_judge_filter](PaddlePaddle#2215-llm_judge_filter)
  - [2.3 Image filtering operators](PaddlePaddle#23-图像过滤算子)
    - [2.3.1 image_filesize_filter](PaddlePaddle#231-image_filesize_filter)
    - [2.3.2 image_ration_filter](PaddlePaddle#232-image_ration_filter)
    - [2.3.3 image_resolution_filter](PaddlePaddle#233-image_resolution_filter)
    - [2.3.4 image_hash_filter](PaddlePaddle#234-image_hash_filter)
  - [2.4 Image-text filtering operators](PaddlePaddle#24-图文过滤算子)
    - [2.4.1 image_clip_filter](PaddlePaddle#241-image_clip_filter)
- [3. Analysis operators](PaddlePaddle#3-分析算子)
  - [3.1 Basic analysis operators](PaddlePaddle#31-基础分析算子)
    - [3.1.1 base_analysis_pipeline](PaddlePaddle#311-base_analysis_pipeline)
      - [3.1.1.1 analyze_dataset_statistics](PaddlePaddle#3111-analyze_dataset_statistics)
      - [3.1.1.2 analyze_language_distribution](PaddlePaddle#3112-analyze_language_distribution)
      - [3.1.1.3 analyze_image_paths](PaddlePaddle#3113-analyze_image_paths)
      - [3.1.1.4 analyze_data_anomalies](PaddlePaddle#3114-analyze_data_anomalies)
      - [3.1.1.5 analyze_conversation_tokens](PaddlePaddle#3115-analyze_conversation_tokens)
  - [3.2 Advanced analysis operators](PaddlePaddle#32-进阶分析算子)
    - [3.2.1 description_analysis](PaddlePaddle#321-description_analysis)
    - [3.2.2 quality_analysis](PaddlePaddle#322-quality_analysis)
- [4. Visualization operators](PaddlePaddle#4-可视化算子)
  - [4.1 LDA visualization operators](PaddlePaddle#41-lda可视化算子)
    - [4.1.1 lda_topic_clustering](PaddlePaddle#411-lda_topic_clustering)
- [5. Generation operators](PaddlePaddle#5-生成算子)
  - [5.1 Multimodal generation operators](PaddlePaddle#51-多模态生成算子)
    - [5.1.1 generate_qna_for_images](PaddlePaddle#511-generate_qna_for_images)



--- 
- PaddlePaddle/PaddleMIX#1055